ABSTRACT

Commonality analysis is a procedure for decomposing 
the coefficient of determination (R^2) in multiple 
regression analyses into the percent of variance in the dependent 
variable associated with each independent variable uniquely, and the 
proportion of explained variance associated with the common effects 
of predictors in various combinations. Commonality analysis is an 
attempt to understand the relative predictive power of regressor 
variables, both individually and in combination. Despite their 
utility, these methods have not been used with great frequency, 
perhaps because these methods are not fully automated in commonly 
used statistical packages. This paper explores the applications of 
commonality analyses by reanalyzing data from a study of the 
relationships among anger and stress in predicting depression among 
247 undergraduate students. This data set is employed as a heuristic 
to make the discussion more accessible. In addition, a Statistical 
Analysis System (SAS) computer program procedure for obtaining all 
possible R^2 values is discussed as an efficient method 
of implementing the required analyses. Two tables and one chart are 
appended. (Contains 13 references.) (Author/SLD) 






Commonality Analysis for the Regression Case 



Kavita Murthy 
Texas A&M University 77843-4225 



Paper presented at the annual meeting of the American Educational Research 
Association, New Orleans, LA, April 6, 1994. 



It has been increasingly recognized that discarding variance by converting
intervally-scaled variables into nominally-scaled variables is not good research practice. As
Kerlinger (1986, p. 558) explains,

    . . . partitioning a continuous variable into a dichotomy or trichotomy
    throws information away. . . . To reduce a set of values with a
    relatively wide range to a dichotomy is to reduce its variance and thus
    its possible correlation with other variables. A good rule of research
    data analysis, therefore, is: Do not reduce continuous variables to
    partitioned variables (dichotomies, trichotomies, etc.) unless
    compelled to do so by circumstances or the nature of the data
    (seriously skewed, bimodal, etc.).

Kerlinger (1986, p. 558) notes that variance is the "stuff" on which all analysis
is based. Discarding variance by categorizing variables amounts to "squandering of
information" (Cohen, 1968, p. 441). Pedhazur (1982, p. 453) agrees that
"Categorization leads to a loss of information, and consequently to a less sensitive
analysis."

Similarly, Humphreys and Fleishman (1974, p. 468) note that categorizing variables
in a non-experimental design using an ANOVA analysis "not infrequently produces in
both the investigator and his audience the illusion that he has experimental control over the
independent variable. Nothing could be more wrong."

In fact, the practice of discarding variance on intervally scaled predictor variables to
perform ANOVA, ANCOVA or MANOVA analyses creates problems in most cases. As
Cliff (1987, p. 130) notes:

    Think of the persons near the borders. Some who should be highs are
    actually classified as lows and vice versa. In addition, the "barely
    highs" are classified the same as the "very highs," even though they
    are different. Therefore, reducing a reliable variable to a dichotomy
    makes the variable more unreliable, not less.

Others (Cohen, 1968, p. 441) have noted that the use of ANOVA-type methods to
analyze data reduces the reliability of most variables considered in the design, inflates Type
II error probability, discards important information, and distorts the distribution shapes of
and relationships among certain variables. These various realizations have led to less
frequent use of OVA methods, and to more frequent use of multiple regression (Elmore &
Woehlke, 1988; Goodwin & Goodwin, 1985; Willson, 1982).

Multiple regression is a statistical technique that offers a method for determining the
weights that should be used to obtain the most accurate linear prediction of a criterion from
several predictors (Allen & Yen, 1979). Researchers in the social sciences who utilize
multiple regression methods in their data analyses typically examine only a few aspects of
multiple regression results. For example, they usually report the magnitude of a
statistically significant multiple regression relationship in terms of the coefficient of
determination (R^2), or the extent to which variance in the dependent variable is "explained"
by various predictors.

It is somewhat rare for researchers to further decompose the R^2 to determine unique
and nonunique contributions of the independent variables to prediction of the criterion
variable (Seibold & McPhee, 1979). Commonality analysis offers a useful method for
partitioning variance because it does not depend upon a priori knowledge of the influence
of predictors. Commonality analysis examines all possible orders of entry of the predictors
into the model, and the predictors essentially fall where they may. Seibold and McPhee
(1979) also argue that:

    Advancement of theory and the useful application of research findings
    depend not only on establishing that a relationship exists among
    predictors and the criterion, but also upon determining the extent to
    which those independent variables, singly and in all possible
    combinations, share variance with the dependent variable. Only then
    can we fully know the relative importance of independent variables
    with regard to the dependent variable in question. (p. 355)

The present paper explains how commonality analysis can be conducted using a
particular SAS procedure and some simple computations. To make the discussion
concrete, actual data involving undergraduates' perceptions of stress and anger as they
relate to depression are used for heuristic purposes.

Purpose of Commonality Analysis 
The purpose of commonality analysis is to partition a squared multiple correlation 
into elements associated with each regressor variable and into elements associated with each 
possible combination of regressors. Commonality analysis generates these elements such 
that the sum of all elements equals the squared multiple correlation. It is also required that 
the sum of all elements associated with a single variable is equal to the squared simple 
correlation of that particular variable with the dependent variable. For the two-predictor 
case, these relationships can be expressed as: 

\[ R^2_{Y \cdot 12} = U_1 + U_2 + C_{12} \]

where R^2_{Y.12} is the squared multiple correlation of Y with variables one and two, U1 is the
"uniqueness" or unique contribution of variable one to the squared multiple correlation, U2
is the unique contribution of variable two, and C12 is the common element or commonality
of variables 1 and 2, or the proportion of variance in Y predictable using either variable 1 or
variable 2. As a result, for this case three variance components can be derived from the
R^2 of the model, namely U1, U2, and C12.
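
As a hedged illustration of the two-predictor case, the sketch below applies these relationships in Python. The three R^2 values are those reported in Appendix A for stress and anger in. Treating those two predictors as a stand-alone two-variable model is an assumption made only for illustration, and the variable names are hypothetical.

```python
# R^2 values taken from Appendix A (1 = stress, 2 = anger in).
r2_1 = 0.19360    # stress alone
r2_2 = 0.13690    # anger in alone
r2_12 = 0.29061   # stress and anger in together

u1 = r2_12 - r2_2            # variance unique to variable 1
u2 = r2_12 - r2_1            # variance unique to variable 2
c12 = r2_1 + r2_2 - r2_12    # variance common to variables 1 and 2

print(f"U1 = {u1:.5f}, U2 = {u2:.5f}, C12 = {c12:.5f}")
print(f"U1 + U2 + C12 = {u1 + u2 + c12:.5f} (R^2 of the two-predictor model)")
print(f"U1 + C12 = {u1 + c12:.5f} (squared simple correlation of variable 1)")
```

The last two printed lines correspond to the two requirements stated above: the components sum to the squared multiple correlation, and the components involving a single variable sum to that variable's squared simple correlation with the dependent variable.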
The number of components grows exponentially with the number of predictors. With p
independent variables there are 2^p - 1 unique and commonality components in total, of
which p are unique components; the number of commonality components is therefore the
difference between the two, or (2^p - 1) - p. As such, the number of components or variance
partitions increases very rapidly as additional predictors are considered.
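
To make the growth concrete, the following minimal sketch tabulates the counts for a few values of p. The function name is hypothetical and the code simply evaluates the expression given above.

```python
def component_counts(p):
    """Return (total, unique, common) component counts for p predictors."""
    total = 2 ** p - 1       # every non-empty combination of predictors
    unique = p               # one unique component per predictor
    common = total - unique  # the remaining components are commonalities
    return total, unique, common

for p in range(2, 7):
    total, unique, common = component_counts(p)
    print(f"p = {p}: total = {total}, unique = {unique}, common = {common}")
```

For four predictors this gives 15 components in all, 4 of them unique and 11 of them commonalities.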

The rules for calculating the unique and commonality components are 
straightforward polynomial expressions developed by Mood and Wisler in 1969. 
However, as the number of independent variables increases, the complexity of the 
respective component calculations also increases. Table 1 presents the formulas for two, 
three and four variable models. Seibold and McPhee (1979) offer the formulas for a five 
variable model. 
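
The formulas for any number of predictors appear to follow a single alternating-sign pattern: start from the negative R^2 of the predictors outside the component, then add back the R^2 values obtained by joining one, two, three, and so on of the component's own predictors, with the sign flipping each time. The Python sketch below generates the signed terms under that assumption. The pattern is inferred from the four-variable formulas rather than stated in the text, and the function names are hypothetical.

```python
from itertools import combinations

def component_terms(component, predictors):
    """Signed R^2 terms whose sum gives the component for `component`.

    Each term is (sign, subset), meaning sign * R^2(subset).
    """
    rest = tuple(v for v in predictors if v not in component)
    terms = []
    if rest:  # R^2 of the empty model is zero, so it is omitted
        terms.append((-1, rest))
    for k in range(1, len(component) + 1):
        sign = 1 if k % 2 == 1 else -1
        for extra in combinations(component, k):
            terms.append((sign, tuple(sorted(rest + extra))))
    return terms

def format_formula(component, predictors):
    """Render the formula in the notation used in Table 1."""
    label = ("U" if len(component) == 1 else "C") + "".join(map(str, component))
    parts = []
    for sign, subset in component_terms(component, predictors):
        parts.append(("+" if sign > 0 else "-") + " R2(" + "".join(map(str, subset)) + ")")
    return label + " = " + " ".join(parts)

print(format_formula((1,), (1, 2, 3, 4)))
print(format_formula((1, 2), (1, 2, 3, 4)))
print(format_formula((1, 2, 3), (1, 2, 3, 4)))
```

Generating the terms programmatically is one way to avoid transcription slips when the formulas become long, as they do for four or more predictors.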



Insert Table 1 about here 

Commonality analysis requires the R^2 value for every possible combination of the
independent variables. SAS provides a useful procedure (PROC RSQUARE) that will print out in
ascending order the R^2 values for all possible combinations of the independent variables in
the model. This SAS routine makes commonality analysis much simpler, since the
calculation of the required R^2 values is fully automated.
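
The all-possible-subsets step is not tied to SAS. The following Python sketch shows one way the same set of R^2 values could be obtained with ordinary least squares. It is not a reproduction of PROC RSQUARE: the data are randomly generated stand-ins (not the study data), and every function and variable name is hypothetical.

```python
from itertools import combinations
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(columns, y):
    """R^2 of an OLS fit of y on the given columns (intercept included)."""
    n = len(y)
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    xc = []
    for col in columns:
        mean = sum(col) / n
        xc.append([v - mean for v in col])
    # Normal equations on mean-centered data: (X'X) b = X'y
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in xc] for ci in xc]
    xty = [sum(u * v for u, v in zip(ci, yc)) for ci in xc]
    b = solve(xtx, xty)
    ss_reg = sum(bi * si for bi, si in zip(b, xty))
    ss_tot = sum(v * v for v in yc)
    return ss_reg / ss_tot

def all_subsets_r2(predictors, y):
    """Map every non-empty combination of predictor names to its R^2."""
    names = list(predictors)
    result = {}
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            result[subset] = r_squared([predictors[name] for name in subset], y)
    return result

# Synthetic data for illustration only.
random.seed(0)
n = 247
predictors = {name: [random.gauss(0, 1) for _ in range(n)]
              for name in ("x1", "x2", "x3", "x4")}
y = [0.5 * a + 0.3 * b + random.gauss(0, 1)
     for a, b in zip(predictors["x1"], predictors["x2"])]

r2 = all_subsets_r2(predictors, y)
for names, value in sorted(r2.items(), key=lambda kv: (len(kv[0]), kv[1])):
    print(len(names), f"{value:.5f}", " ".join(names))
```

With four predictors the dictionary holds 15 entries, one for each model needed by the four-variable commonality formulas.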

The obtained R^2 values are then used to determine all unique and common effects, by
substituting them into the appropriate commonality formulas listed in Table 1. This can be
easily accomplished by using any spreadsheet program to implement the appropriate
combination of formulas.

Once the variance components have been determined, the results can then be
arranged in a summary table that is easy to interpret and allows for inspection of the total
variance associated with each independent variable (column totals) as well as specific
unique and common effects (row entries). Another check on the analysis is that the sum of
all unique and commonality values should equal the R^2 value of the regression model when
all the independent variables are entered into the model.

Heuristic Example 

Data from a previous study involving the relationship between anger and stress in 
predicting depression among undergraduate students can be employed to illustrate the steps 
in the process of conducting a commonality analysis. The reader may want to refer to this 
study which was published by Clay, Anderson, and Dixon (1993) in the 
September/October issue of the Journal of Counseling and Development. Within this 
study, 247 undergraduates completed three paper-and-pencil questionnaires assessing
stressful life events, depression, and anger expression. The anger expression instrument
yielded three different subscales: anger in (IN), anger out (OUT), and anger control
(CONT), while the stressful life events instrument yielded only one score (STS).

The study concluded that anger in and stressful life events were
significantly related to depression, and that anger out and anger control were not. Thus,
the authors decided to eliminate anger out and anger control from further analyses and
focus only on the other two variables (STS and IN). But because of the high degree of correlation
among all of these predictor variables (IN, OUT, CONT, and STS), commonality analysis
can be used to determine the unique and common components of these variables so that a
more accurate explanation of the prediction of depression can be obtained.

The first step is to obtain the 15 equations necessary for computing the unique and 
commonality components of a 4-variable model. These equations are obtained from Table 
1. 

The next step is to extract all R^2 values from the SAS printout (see Appendix
A) and substitute these accordingly into the 15 equations. Appendix A presents the R^2
values for all possible combinations of the predictors in this data set.

Third, the appropriate formulas are then applied to the various R^2 values using any
calculator or spreadsheet program. For example:

\[
\begin{aligned}
U_1\ (\text{STS}) &= -R^2(234) + R^2(1234) \\
&= -.15977 + .30414 \\
&= .14437
\end{aligned}
\]

Therefore, the unique contribution of the variable stress (STS) to the proportion
of total dependent variance explained was .14437, or approximately 14%. In addition, the
commonality between stress (STS, variable 1) and anger in (IN, variable 2) can be computed as:

\[
\begin{aligned}
C_{12}\ (\text{STS/IN}) &= -R^2(34) + R^2(134) + R^2(234) - R^2(1234) \\
&= -.01953 + .20307 + .15977 - .30414 \\
&= .03917
\end{aligned}
\]

Thus, the common variance of the model shared by stress (STS) and anger in (IN) is
.03917, or approximately 4%.
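
The same substitutions can be carried out for all 15 components at once. The Python sketch below does so with the R^2 values listed in Appendix A. It assumes the alternating-sign pattern visible in the four-variable formulas, and the names `R2`, `PREDICTORS`, and `component` are hypothetical.

```python
from itertools import combinations

# R^2 values from Appendix A; keys list the predictors in each model
# (1 = stress, 2 = anger in, 3 = anger out, 4 = anger control).
R2 = {
    (1,): .19360, (2,): .13690, (3,): .01210, (4,): .01440,
    (1, 2): .29061, (1, 3): .19692, (1, 4): .20239,
    (2, 3): .14669, (2, 4): .15719, (3, 4): .01953,
    (1, 2, 3): .29347, (1, 2, 4): .30398,
    (1, 3, 4): .20307, (2, 3, 4): .15977,
    (1, 2, 3, 4): .30414,
}
PREDICTORS = (1, 2, 3, 4)

def component(target, r2=R2, predictors=PREDICTORS):
    """Unique (one predictor) or common (several) component for `target`."""
    rest = tuple(v for v in predictors if v not in target)
    value = -r2[rest] if rest else 0.0  # empty model explains nothing
    for k in range(1, len(target) + 1):
        sign = 1 if k % 2 == 1 else -1
        for extra in combinations(target, k):
            value += sign * r2[tuple(sorted(rest + extra))]
    return value

components = {t: component(t)
              for k in range(1, len(PREDICTORS) + 1)
              for t in combinations(PREDICTORS, k)}

for t, value in components.items():
    label = ("U" if len(t) == 1 else "C") + "".join(map(str, t))
    print(f"{label:<6}{value: .5f}")
print(f"Sum   {sum(components.values()): .5f}")
```

The output includes the two worked values above (.14437 for U1 and .03917 for C12). Because the inputs are R^2 values rounded to five decimals, agreement with a published table in the last decimal place is not guaranteed.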

The fourth step is to arrange these obtained values into a commonality analysis
summary table, like the one presented in Table 2. Once in tabular form, the previously
mentioned checks on the data can be performed. For example, summing down columns
for each independent variable will yield the R^2 of the regression model in which that
independent variable is the only variable entered into the model. Another check is that the
sum of all unique and commonality values should equal the R^2 value of the regression
model when all the independent variables are entered into the model.
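
Both checks are easy to automate. The sketch below applies them to the component values as reported in Table 2 and the R^2 values in Appendix A. The dictionary layout, the names, and the rounding tolerance are assumptions chosen for illustration.

```python
# Components as reported in Table 2, keyed by the predictors involved
# (1 = stress, 2 = anger in, 3 = anger out, 4 = anger control).
TABLE2 = {
    (1,): .14437, (2,): .10107, (3,): .00016, (4,): .01067,
    (1, 2): .03917, (1, 3): .00241, (1, 4): .00242,
    (2, 3): .00052, (2, 4): -.00452, (3, 4): .00270,
    (1, 2, 3): .00203, (1, 2, 4): -.00113,
    (1, 3, 4): .00451, (2, 3, 4): -.00006,
    (1, 2, 3, 4): -.00018,
}
SINGLE_R2 = {1: .19360, 2: .13690, 3: .01210, 4: .01440}  # Appendix A
FULL_R2 = .30414                                           # Appendix A
TOL = 5e-5  # allow for rounding in the fifth decimal place

# Check 1: each column total should match that predictor's own R^2.
column_totals = {
    v: sum(value for key, value in TABLE2.items() if v in key)
    for v in SINGLE_R2
}
for v, total in column_totals.items():
    ok = abs(total - SINGLE_R2[v]) < TOL
    print(f"variable {v}: column total {total:.5f}, R^2 {SINGLE_R2[v]:.5f}, ok={ok}")

# Check 2: all components together should match the full-model R^2.
grand_total = sum(TABLE2.values())
print(f"sum of all components {grand_total:.5f}, full R^2 {FULL_R2:.5f}")
```

A tolerance is used rather than exact equality because the tabled values are rounded to five decimals.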



Insert Table 2 about here 
The commonality summary table presented in Table 2 indicates that the unique
predicted variance contribution of the predictor stress is approximately 14% (.14437) and
its total commonality variance with one or more of the other predictors is approximately 5%
(.04923). In this particular example, stress is the dominant factor in
predicting depression among college students. In addition, anger in seems to be another
helpful predictor of depression because it uniquely contributed approximately 10%
(.10107) of the variance and its total commonality variance is approximately 4% (.03583).
By contrast, the remaining variables (anger out and anger control) offer little unique
contribution to the variance (.00016 and .01067, respectively).

Also, some instances of negative commonalities may occur, as in this particular
study with C124, C24, C234, and C1234, as reported in Table 2. This result can be
"counter-intuitive since the result could be taken to mean that . . . predictor variables have in
common the ability to explain less than 0% of the variance" (Thompson, 1985, p. 54).
However, the presence of negative commonalities is typically attributable to so-called
suppressor effects, which might have been the case for this study (Beaton, 1973).

Commonality analysis is an attempt to understand the relative predictive power of
the regressor variables both individually and in combination. This capability offers distinct
advantages over more frequently used, traditional types of analyses such as ANCOVA or
stepwise regression. Commonality analysis is straightforward and easy to calculate when
no more than four independent variables are involved, with the assistance of the SAS
PROC RSQUARE procedure. As such, commonality analysis can be, and should be, used
more frequently in educational and social science research to partition the variance of the
dependent variable into its constituent predicted parts.
Table 1

Formulas for Unique and Commonality Components of Variance

Two Independent Variables

\[
\begin{aligned}
U_1 &= -R^2(2) + R^2(12) \\
U_2 &= -R^2(1) + R^2(12) \\
C_{12} &= R^2(1) + R^2(2) - R^2(12)
\end{aligned}
\]

Three Independent Variables

\[
\begin{aligned}
U_1 &= -R^2(23) + R^2(123) \\
U_2 &= -R^2(13) + R^2(123) \\
U_3 &= -R^2(12) + R^2(123) \\
C_{12} &= -R^2(3) + R^2(13) + R^2(23) - R^2(123) \\
C_{13} &= -R^2(2) + R^2(12) + R^2(23) - R^2(123) \\
C_{23} &= -R^2(1) + R^2(12) + R^2(13) - R^2(123) \\
C_{123} &= R^2(1) + R^2(2) + R^2(3) - R^2(12) - R^2(13) - R^2(23) + R^2(123)
\end{aligned}
\]

Four Independent Variables

\[
\begin{aligned}
U_1 &= -R^2(234) + R^2(1234) \\
U_2 &= -R^2(134) + R^2(1234) \\
U_3 &= -R^2(124) + R^2(1234) \\
U_4 &= -R^2(123) + R^2(1234) \\
C_{12} &= -R^2(34) + R^2(134) + R^2(234) - R^2(1234) \\
C_{13} &= -R^2(24) + R^2(124) + R^2(234) - R^2(1234) \\
C_{14} &= -R^2(23) + R^2(123) + R^2(234) - R^2(1234) \\
C_{23} &= -R^2(14) + R^2(124) + R^2(134) - R^2(1234) \\
C_{24} &= -R^2(13) + R^2(123) + R^2(134) - R^2(1234) \\
C_{34} &= -R^2(12) + R^2(123) + R^2(124) - R^2(1234) \\
C_{123} &= -R^2(4) + R^2(14) + R^2(24) + R^2(34) - R^2(124) - R^2(134) - R^2(234) + R^2(1234) \\
C_{124} &= -R^2(3) + R^2(13) + R^2(23) + R^2(34) - R^2(123) - R^2(134) - R^2(234) + R^2(1234) \\
C_{134} &= -R^2(2) + R^2(12) + R^2(23) + R^2(24) - R^2(123) - R^2(124) - R^2(234) + R^2(1234) \\
C_{234} &= -R^2(1) + R^2(12) + R^2(13) + R^2(14) - R^2(123) - R^2(124) - R^2(134) + R^2(1234) \\
C_{1234} &= R^2(1) + R^2(2) + R^2(3) + R^2(4) - R^2(12) - R^2(13) - R^2(14) - R^2(23) \\
&\quad - R^2(24) - R^2(34) + R^2(123) + R^2(124) + R^2(134) + R^2(234) - R^2(1234)
\end{aligned}
\]
Table 2

Commonality Analysis Summary Table

            1           2           3           4
Component   STRESS      ANGER IN    ANGER OUT   ANGER CONTROL

U1          .14437
U2                      .10107
U3                                  .00016
U4                                              .01067
C12         .03917      .03917
C13         .00241                  .00241
C14         .00242                              .00242
C23                     .00052      .00052
C24                     -.00452                 -.00452
C34                                 .00270      .00270
C123        .00203      .00203      .00203
C124        -.00113     -.00113                 -.00113
C134        .00451                  .00451      .00451
C234                    -.00006     -.00006     -.00006
C1234       -.00018     -.00018     -.00018     -.00018

TOTAL       .19360      .13690      .01210      .01440
U           .14437      .10107      .00016      .01067
C           .04923      .03583      .01194      .00373
Appendix A

R-Squares of Stress and Anger Expression as Predictors of Depression

Number of Predictors
in Model                R-Square    Variables in Model

1                       .19360      1 STRESS
                        .13690      2 ANGER IN
                        .01210      3 ANGER OUT
                        .01440      4 ANGER CONTROL

2                       .29061      12
                        .20239      14
                        .19692      13
                        .15719      24
                        .14669      23
                        .01953      34

3                       .30398      124
                        .29347      123
                        .20307      134
                        .15977      234

4                       .30414      1234